IN THE CLAIMS 



1 (Original). A method comprising: 

receiving a first program unit in a parallel computing environment having a team 
of parallel threads including at least a first and second thread, the first program unit including a 
memory copy operation to be performed between the first thread and the second thread; and 

translating the first program unit into a second program unit, the second program 
unit to associate the memory copy operation with a set of one or more instructions, the set of 
instructions to ensure that the second thread copies data based, in part, on a first descriptor 
associated with the first thread. 

2 (Currently Amended). The method of claim 1 further comprising copying an 
address of the first descriptor to a buffer and copying data into a memory area associated with 
the second thread based, in part, on address and data information associated with the first 
descriptor. 

3 (Original). The method of claim 2 further comprising copying data into a memory 
area associated with second thread utilizing, in part, a second descriptor associated with the 
second thread. 

4 (Original). The method of claim 1 further comprising enabling the first thread to copy 
an address of the first descriptor to a buffer and setting a signal to enable the second thread to 
copy data associated with the first descriptor to a memory area associated with the second thread. 

5 (Original). The method of claim 4 further comprising enabling the first thread to enter 
a wait state after the signal is set. 

6 (Original). The method of claim 5 further comprising releasing the first thread from a 
wait state upon completion of the data copy operation by the second thread. 

7 (Original). The method of claim 5 further comprising enabling the first thread to copy 
an address of the first descriptor to one of two buffer areas. 

8 (Original). The method of claim 1 further comprising receiving the first program unit 
in source code format and translating the first program unit into a second program unit in source 
code format. 

9 (Original). A machine-readable medium that provides instructions, that when 
executed by a machine, enables the machine to perform operations comprising: 

receiving a first program unit in a parallel computing environment, the first 
program unit including a memory copy operation to be performed between a first thread in a 
team of threads and a second thread in the team of threads; and 

translating the first program unit into a second program unit, the second program 
unit to associate the memory copy operation with a set of one or more instructions, the set of 
instructions to ensure that the second thread copies data based, in part, on a first descriptor 
associated with the first thread. 

10 (Currently Amended). The machine-readable medium of claim 9, further 
comprising copying an address of the first descriptor to a buffer and copying data into a 
memory area associated with the second thread based, in part, on address and data information 
associated with the first descriptor. 

11 (Original). The machine-readable medium of claim 10, further comprising copying 
data into a memory area associated with second thread based utilizing, in part, a second 
descriptor associated with the second thread. 

12 (Original). The machine-readable medium of claim 9, further comprising enabling the 
first thread to copy an address of the first descriptor to a buffer and setting a signal to enable the 
second thread to copy data associated with the first descriptor to a memory area associated with 
the second thread. 

13 (Original). The machine-readable medium of claim 12, further comprising enabling 
the first thread to enter a wait state after the signal is set. 

14 (Original). The machine-readable medium of claim 13, further comprising releasing 
the first thread from a wait state upon completion of the data copy operation by the second 
thread. 

15 (Original). The machine-readable medium of claim 13, further comprising enabling 
the first thread to copy an address of the first descriptor to one of two buffer areas. 

16 (Original). The machine-readable medium of claim 12, further comprising copying 
data into a memory area associated with second thread utilizing, in part, a second descriptor 
associated with the second thread. 

17 (Original). The machine-readable medium of claim 9 further comprising receiving the 
first program unit in source code format and translating the first program unit into the second 
program unit in source code format. 

18 (Currently Amended). A method comprising: 

receiving a first program unit in a parallel computing environment and translating 
the first program unit, in part, into one or more computer instructions, the instructions enabling a 
second thread in a team of threads to copy data, into a memory area associated with the second 
thread, from a private memory area associated with a first thread; and 

copying an address of a descriptor into a buffer utilized by the second thread, 
in part, to copy data from the memory area associated with the first thread. 

19 (Original). The method of claim 18, further comprising creating a descriptor utilized, 
in part, by the second thread to copy data into the memory area associated with the second 
thread. 

20 (Original). The method of claim 19, further comprising setting a signal by the first 
thread enabling the second thread to copy the data from the memory area associated with the first 
thread. 

21 (Original). The method of claim 20, further comprising entering a wait state by the 
first thread until the second thread copies the data from the memory area associated with the first 
thread. 

22 (Original). An apparatus comprising: 

a memory including a shared memory location; and 

a translation unit coupled with the memory, the translation unit operative to 
associate a first program unit, including a memory copy operation to be performed between a 
first thread in a team of threads and a second thread in the team of threads, with a set of one or 
more instructions, the set of instructions to ensure that the second thread copies data based, in 
part, on a first descriptor associated with the first thread. 

23 (Currently Amended). The apparatus as in claim 22 wherein an address of the 
first descriptor is copied to a buffer by the first thread and the second thread copies data into a 
memory area associated with the second thread based, in part, on address and data information 
associated with the first descriptor. 

24 (Original). The apparatus as in claim 23 wherein the second thread copies data into a 
memory area associated with the second thread utilizing, in part, a second descriptor associated 
with the second thread. 

25 (Original). The apparatus as in claim 22 wherein the first thread copies an address of 
the first descriptor to a buffer and sets a signal to enable the second thread to copy data 
associated with the first descriptor to a memory area associated with the second thread. 

26 (Original). The apparatus as in claim 25 wherein the first thread enters a wait state 
after the signal is set. 

27 (Original). The apparatus of claim 26, wherein the first thread exits the wait state after 
completion of the data copy by the second thread. 

28 (Original). The apparatus of claim 22 wherein the first program unit is in source code 
format. 

29 (Original). The apparatus of claim 28 wherein the first descriptor is passed to the first 
program unit. 

30 (Original). The apparatus as in claim 22 wherein the translation unit translates the 
first program unit, in part, into a second program unit in source code format and the second 
program unit includes the memory copy operation. 
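
Illustrative note (not part of the claims). The handshake recited in claims 4 through 6 can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the function name `broadcast_private`, the dictionary used as the descriptor, and the use of Python `threading.Event` objects as the signal and the wait-state release are all hypothetical choices made for illustration. The first thread publishes a reference to its descriptor in a shared buffer, sets a signal, and waits; the second thread copies the described data into its own area and then releases the first thread.

```python
import threading


def broadcast_private(values):
    """Hypothetical sketch: a second thread copies a first thread's private data."""
    buffer = {}                  # shared buffer that receives the descriptor reference
    ready = threading.Event()    # signal set by the first thread
    done = threading.Event()     # set by the second thread when its copy is complete
    result = {}

    def first_thread():
        private = list(values)   # stands in for the first thread's private memory area
        # Assumed descriptor layout: where the data lives and how much of it there is.
        descriptor = {"data": private, "length": len(private)}
        buffer["descriptor"] = descriptor   # copy the descriptor "address" to the buffer
        ready.set()                         # set the signal for the second thread
        done.wait()                         # wait state until the copy has completed

    def second_thread():
        ready.wait()                        # proceed only after the signal is set
        descriptor = buffer["descriptor"]
        # Copy into a memory area owned by the second thread.
        own_area = descriptor["data"][:descriptor["length"]]
        result["copy"] = own_area
        result["is_distinct"] = own_area is not descriptor["data"]
        done.set()                          # release the first thread from its wait state

    threads = [
        threading.Thread(target=first_thread),
        threading.Thread(target=second_thread),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result
```

Keeping the first thread in its wait state until the copy completes means its private data stays valid for the whole time the second thread is reading it through the descriptor.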